(57) ABSTRACT

A method places procedures of an application program in a 
memory in order to maximize performance. The application 
program is first mapped to non-executable addresses of the 
memory. A segment of the memory large enough to store the 
program is allocated as executable. The procedures are then 
copied from the mapped non-executable addresses to the 
executable segment as the procedures are executed. The 
procedures are copied in the order that they are executed. 
METHOD FOR DYNAMICALLY PLACING PROCEDURES OF A PROGRAM IN A
MEMORY

FIELD OF THE INVENTION

This invention relates generally to placing procedures of
software programs in a memory during the execution of
instructions of the programs, and more particularly to
placing the procedures so as to minimize misses in a cache
memory.

BACKGROUND OF THE INVENTION

In computer systems, it is desired to improve the
performance of large production applications such as
database servers. These applications often have a huge
number of features and include large numbers of instructions
in single executable images of the applications. For
example, an executable image for the Oracle 7.3 database
application includes about nine megabytes of instructions.

In addition, many of these large applications are known to
have a large instruction "footprint," that is, a large
number of instructions are used repeatedly during the
execution of the application. The large instruction
footprint causes these applications to incur a large number
of instruction cache capacity misses, particularly in the
relatively small first-level instruction cache typically
co-located on the same semiconductor chip with the
processor. Even worse, a calling procedure may overlap with
a called procedure in a cache. In that case, there is a
direct conflict.

Because of the cache misses, a large percentage of an
application's execution time can be consumed by processor
stalls while missing instructions are fetched from more
distant levels of memory. For example, when the Oracle
application is executing the TPC benchmarks on Digital
Equipment Corporation's AlphaServer 4100 multi-processor,
instruction stalls can account for up to 30% of the
execution time.

In the prior art, instruction stalls due to cache misses
have been reduced by controlling where the instructions are
placed in memory. By careful placement of instructions,
caches can take advantage of spatial locality. Most systems
have made a static determination for the layout of
instructions at compile time or link time.

For instance, linkers have been used to construct a call
graph of the application. The call graph indicates how the
procedures are called. Procedures that call each other can
be placed together in the memory. If procedures are placed
close to each other, cache conflicts are minimized, even in
a direct-mapped instruction cache. Furthermore, if the size
of the instruction cache is known, then procedures can be
placed in a non-conflicting manner, even when they are not
immediately adjacent to each other, which may not always be
possible.

However, statically determining an optimal placement for
procedures is difficult. In many cases, the actual execution
flow may depend on the run-time environment, such as the
workload (data) that is processed. Usually, the workload
cannot be determined statically. In addition, cache
conflicts need only be avoided for procedures that call each
other often. A procedure that is called only occasionally
does not require optimal placement.

Many systems have used profile information to guide the
layout of instructions at link time. Profile information is
statistical data that are collected during execution of the
application. The statistics can identify the relative
frequency at which procedures are called.

However, profile-based placement has two problems. First,
users have to profile the application, and the application
has to be re-compiled or re-linked to take advantage of the
optimal placement. Second, even with profiling, there is no
guarantee that the statistics will reflect different uses of
the application.

The second problem is especially interesting in the case of
an application such as the Oracle database server. The
Oracle program can be used for on-line transaction
processing (OLTP), as well as for decision support systems
(DSS). These different uses might have totally different
execution profiles; consequently, an instruction layout for
one use may be totally unsuitable for the other. Optimally
satisfying both needs would require different versions of
the application.

In one prior art simulated approach, instructions are
dynamically placed in the memory by a dynamic loader. This
approach is described by Chen et al. in "Improving
Instruction Locality with Just-In-Time Code Layout", The
1997 USENIX Windows NT Workshop, August 1997. There, a
dynamic layout approach for reducing instruction cache
conflicts is evaluated. However, the approach is only a
simulation without any actual implementation in a real
system.

The simulation avoids a number of hard problems that must be
solved for practical systems. A particular problem that will
be encountered during practical dynamic instruction
placement is the use of indirect procedure calls, for
example, the use of function pointers in the C programming
language. Because the initialization and utility routines
for C programs typically make use of function pointers in a
modern operating system, the prior art approach would not
work in current systems, even if the actual application does
not produce or use procedure pointers.

There is also a problem with intercepting calls to
procedures that have not yet been placed in memory.
According to Chen, the entry points of the procedures would
be replaced by small "thunks" of instructions. These
instructions would effectively "jump" to the dynamic loader
when the procedures are called. Patching the "thunks"
requires an extra pass over the entire program at program
startup to install the thunks, or the stored copy of the
executable image has to be changed.

There would be a considerable run-time overhead should this
simulation ever be implemented in a real system, and the
behavior of this method in a complex large-scale database
application is difficult to predict.

Therefore, there is a need for an efficient and practical
method that can dynamically place instructions of an
application program in a memory of a computer system while
the program is executing so as to reduce conflicts of the
instructions in caches.

SUMMARY OF THE INVENTION

The invention provides a method for placing procedures of an
application program in a memory. A stored copy of the
application program is first mapped to non-executable
addresses of the memory. A segment of the memory large
enough to store the program is allocated as executable. The
procedures are then copied from the mapped non-executable
addresses to the allocated executable segment in the order
that the procedures are called to dynamically place the
procedures of the program in the memory.

In one aspect of the invention, a starting procedure of the
program is copied to the segment by a loader, and all
procedure calls in the starting procedure are modified to be
procedure calls to the loader. The order in which the
procedures are called determines the order in which the
procedures are placed in the segment, so that procedures
that call each other are placed adjacent to each other.
After the called procedures have been placed in the segment,
the destination addresses of procedure calls are modified to
indicate the new locations of the procedures.

If a particular procedure attempts to call another proce- 
dure via a pointer that is a reference to an address in the 
original non-executable segment, then a signal is generated 
to call the loader to copy the required procedure. As proce- 
dures are copied and placed, procedure calls via jump 
instruction calls using absolute addresses are converted to 
calls using branch instruction calls using relative addresses 
where possible. 

BRIEF DESCRIPTION OF THE DRAWINGS 

FIG. 1 is a block diagram of a memory allocated accord- 
ing to the invention; and 

FIG. 2 is a flow diagram of the procedure placement 
according to the invention. 

DETAILED DESCRIPTION OF PREFERRED 
EMBODIMENTS 

Introduction 

A method for dynamically placing procedures of an 
application program in a memory of a computer system 
proceeds as follows. A stored copy of the application pro- 
gram is first mapped to a non-executable virtual address 
space. In addition, an empty executable "text" segment is 
allocated for the program by a dynamic loader. The allocated 
segment is large enough to store the program. The term text 
is generally used to refer to the instructions of the program. 

The dynamic loader copies the procedures from the 
non-executable address space to the allocated executable 
text segment in the order that the procedures are called. As 
the procedures are copied, all procedure calls are replaced by 
calls to the dynamic loader. In addition, the loader performs 
any necessary changes ("relocations") in the instructions of 
the procedures because the procedures have been copied to 
new locations. As the program executes, the dynamic loader 
will be invoked when a procedure attempts to call another 
procedure that has not yet been placed in the text segment. 

The loader places the called procedures in the text seg- 
ment in the order that they are called. The destination 
addresses in the procedure call instructions of the calling 
procedure are replaced to now call the procedure at its new 
location in the text segment, and execution continues. That 
is, destination addresses of calls are adjusted taking into 
consideration the current program counter (PC) and other 
registers such as the global pointer (GP) register, and the 
new locations of the procedures. 

Eventually, all called procedures will be placed. Note, any 
procedures that are never called during a particular 
execution, i.e., "dead code," will not be copied. In other 
words, the method only copies procedures that are actually 
used. Large applications may include many procedures that 
are not needed for a particular execution. 
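
The placement order described in this introduction can be illustrated with a small simulation. The following is a minimal sketch, not the loader of the preferred embodiment: the function name `place_in_call_order`, the `(caller, callee)` trace format, the procedure sizes, and the base address are all hypothetical, and each caller/callee pair is assumed to have a single call site.

```python
LOADER = None  # sentinel: a call site whose destination is still the loader


def place_in_call_order(calls, sizes, start, base=0x1000):
    """Simulate first-call placement of procedures in a new text segment.

    calls: (caller, callee) direct calls in execution order (assumed input).
    sizes: procedure name -> size in bytes (assumed input).
    Returns (name -> new address, number of loader invocations).
    """
    placed = {start: base}          # the starting procedure goes first
    end = base + sizes[start]       # current end of the placed procedures
    call_sites = {}                 # (caller, callee) -> patched destination
    loader_invocations = 0

    for caller, callee in calls:
        site = (caller, callee)
        if call_sites.get(site, LOADER) is LOADER:
            # The call site still points at the loader, so the loader runs.
            loader_invocations += 1
            if callee not in placed:
                # Copy the callee to the end of the segment.
                placed[callee] = end
                end += sizes[callee]
            # Patch the call site to the callee's new location.
            call_sites[site] = placed[callee]
    return placed, loader_invocations
```

With sizes of 64, 32, and 128 bytes for procedures A, B, and C, and a trace in which A calls B twice before B calls C, procedure B lands directly after A and C directly after B. A procedure that never appears in the trace is never copied.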
Indirect Procedure Calls 

The main complexity concerns how to handle indirect 
procedure calls. The application program may generate data 
that are procedure pointers. That is, the data are a destination 
address of a procedure that is computed as the program 
executes. The data are used as an operand in, for example, 
a computed "jump to subroutine" (jsr) instruction on Digi- 
tal's Alpha processors. 

Because the destination address of procedure pointers may
change for successive executions of the procedure call, it
is not possible to locate and change all procedure pointers
to particular destinations when the procedures are placed in
memory. Therefore, a call could be to the wrong destination,
the procedure's old location.

The solution is to initially map the entire application to
an address space of the memory that is allocated as
non-executable before the application is executed. In the
UNIX operating system, this can be done using the "mmap()"
function. If the application is mapped in such a manner,
then the dynamic loader is invoked by a signal when the
application attempts to call a procedure that has a
destination address in the non-executable address space,
i.e., calls a procedure using an "old" destination address
or pointer.
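
As an illustration of the mapping step only, the sketch below uses Python's standard `mmap` wrapper around the UNIX call to map a copy of a program image as readable but not executable. It is an assumption-laden sketch: the helper name `map_non_executable` and the use of a temporary file in place of the stored application image are hypothetical, the code is UNIX-only, and it does not install the signal handler that would invoke the loader.

```python
import mmap
import tempfile


def map_non_executable(image: bytes) -> mmap.mmap:
    """Map a copy of `image` with read permission only (UNIX only).

    The temporary file stands in for the stored copy of the application.
    """
    with tempfile.TemporaryFile() as f:
        f.write(image)
        f.flush()
        # PROT_READ alone: the pages carry no PROT_EXEC permission, so an
        # attempt to execute from them would fault at the hardware level.
        return mmap.mmap(f.fileno(), len(image),
                         flags=mmap.MAP_PRIVATE, prot=mmap.PROT_READ)
```

The returned mapping can still be read, which is what the loader needs in order to copy procedures out of it.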
Layout of Address Space in Memory 

FIG. 1 shows how the address space of a memory 100 of a
computer system is allocated to allow dynamic procedure
placement according to the invention. The memory 100 uses
64-bit virtual addresses as are found on Digital Equipment
Corporation's Alpha processors. It should be understood that
the invention is equally applicable to other addressing
mechanisms. In FIG. 1, the arrows indicate the direction of
expansion for the various allocated regions. Because of the
virtual address space, large regions of memory can liberally
be allocated to ensure no overflow or underflows during
execution.

Some of the memory 100 is allocated for the stack 110,
data 120, and the "malloc" heap 130. These allocations are
conventional. A stored copy of the application program, as
produced by a linker, is mapped to an original text segment
140 in its normal location. As stated above, the segment 140
is protected as non-executable. A new text segment 150 is
also allocated and made executable. It is to this area that
the dynamic loader will copy the procedures as they are
called during execution.

Note, in FIG. 1, this region is labeled as the "first" new
segment 150. The present method can be restarted to
dynamically place procedures in another ("second") text
segment 160 during different phases of the executing
application, perhaps depending on workload. This variation
and its advantages are described in detail below.

The starting procedure (A) 151 is placed at the beginning
of the new text segment 150. In the Digital UNIX operating
system, the starting procedure is the "_start()" function.
All procedure call instructions, for example, jsr and bsr
instructions on the Alpha processor, are replaced with calls
to the loader. Execution flow is transferred to the starting
procedure A 151.

When procedure A calls the next procedure directly or
indirectly, for example, procedure B 141, that has not yet
been placed in the new text segment 150, the loader will be
invoked to place the called procedure next to the calling
procedure A in the new text segment 150. The procedure call
instruction in the calling procedure is also updated. If
procedure A originally called procedure B via a direct
procedure call, e.g., a bsr instruction on a Digital Alpha
processor, then the call is updated so that procedure A
calls procedure B directly at its new executable address. If
procedure A calls procedure B via an indirect procedure
call, e.g., a jsr instruction, then the original indirect
call is restored, because procedure A may call a number of
different procedures via the indirect call. All known
procedure pointers to procedure B in the application program
are updated to point to the new address of procedure B.

A signal (or trap) will be generated when procedure A 151
attempts to call a procedure in the "old" original
non-executable text segment 140 via an indirect call using
an "old" address. Whenever the loader is invoked by the
signal, this is an indication that an indirect procedure
call has been made via a procedure pointer that has not yet
been updated, i.e., the pointer stores the "old" address of
the procedure. In that case, the following steps are
performed.

Based on the operands of the calling instruction and the 
contents of the processor registers, the loader determines the 
destination address of the indirect procedure call. If the 
called procedure has not already been placed in the new text 
segment 150, then the loader copies the called procedure to 
the current end of the previously placed procedure, and 
replaces all procedure calls as described above. Otherwise, 
if the called procedure has been placed in the new text 
segment 150, then the loader uses the entry point address 
(destination address) of the called procedure. 
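
The steps above can be sketched as a small lookup-and-place routine. This is a hedged illustration rather than the handler of the preferred embodiment: the class name `TrapHandler`, the table of old entry addresses and sizes, and the list of known procedure pointers are hypothetical stand-ins for state that a real loader would derive from the instruction operands, the processor registers, and the program image.

```python
class TrapHandler:
    """Toy model of the loader's response to an indirect call via an old address."""

    def __init__(self, old_procs, new_base):
        self.old_procs = dict(old_procs)  # old entry address -> size in bytes
        self.new_entry = {}               # old entry address -> new entry address
        self.end = new_base               # current end of the new text segment

    def resolve(self, old_target, known_pointers):
        """Return the new entry address for a call to `old_target`."""
        if old_target not in self.old_procs:
            raise ValueError("not a procedure entry in the old text segment")
        if old_target not in self.new_entry:
            # Not yet placed: copy to the current end of the new segment.
            self.new_entry[old_target] = self.end
            self.end += self.old_procs[old_target]
        new_target = self.new_entry[old_target]
        # Update every known pointer that still holds the old address.
        for i, pointer in enumerate(known_pointers):
            if pointer == old_target:
                known_pointers[i] = new_target
        return new_target
```

A second trap on the same old address does not copy the procedure again; it simply returns the entry point recorded on the first trap.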

The dynamic layout according to the invention yields
better performance because of fewer cache misses. In
addition, the present method yields better performance in
the instruction translation look-aside buffer (TLB) because
the active instructions are spread over fewer virtual memory
pages. It should be understood that confining the spatial
locality of frequently executed procedures will yield better
performance in any cached memory, on-chip, off-chip, or even
paged dynamic random access memory, i.e., main memory.
JSR-to-BSR Conversion 

The basic method for procedure placement according to
the invention can further be enhanced to take advantage of
the dynamic layout. Relatively large applications must often
use a "jsr" type of instruction (a jump to a specific
address) to implement a procedure call that cannot be
reached by a "bsr" type of instruction (a branch to a
relative address) because the called procedure is beyond the
range that can be specified as a relative address.

Because the present method copies calling procedures 
next to each other and only copies executed procedures, less 
memory is used, and many of the jsr instructions can be 
replaced by bsr instructions (jsr-to-bsr conversion). When 
the jsr-to-bsr conversion occurs, extra instructions that 
update global pointer (GP) register values can be eliminated 
or bypassed. In addition, the jsr-to-bsr conversion can elimi- 
nate any instructions that are used to load the address of the 
called procedure into a register. 
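
Whether a given call can be converted comes down to a range check on the relative displacement. The sketch below assumes the Alpha branch instruction format, in which the displacement is a signed 21-bit count of 4-byte instructions relative to the updated program counter (PC + 4); the function name `bsr_displacement` is hypothetical and the check is not taken from the patent text.

```python
def bsr_displacement(pc, target):
    """Return the branch displacement for a bsr at `pc`, or None if out of range.

    Assumes a signed 21-bit displacement counted in 4-byte instructions,
    relative to pc + 4.
    """
    delta = target - (pc + 4)
    if delta % 4 != 0:
        return None                      # target is not instruction-aligned
    disp = delta // 4
    if -(1 << 20) <= disp < (1 << 20):
        return disp                      # reachable: jsr can become bsr
    return None                          # too far: keep the jsr
```

Under this assumption a bsr reaches roughly 4 megabytes in either direction, which is why packing only the executed procedures next to each other makes more calls convertible.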
Instruction Hints 

Some processors, such as Digital's Alpha processors, 
allow the programmer or compiler to set a "hint" for indirect 
jump instructions. The dynamic loader can set the hint on an 
unconverted jsr instruction to be the address of the likely 
called procedure, e.g., the procedure that is first called by the 
jsr, because the loader is invoked on the first call. This step 
can contribute to better performance even for jsr instructions 
that cannot be converted, because this is something that 
cannot be done by a compiler. Often applications are 
designed so that for a computed call (a source program 
"case" statement), the default case (procedure) is executed 
over and over, and other procedures are only called as rare 
exceptions. 

Application Phase Dependent Procedure Placement 

Another variation of the present method allows the
placement of procedures to be restarted. Here, procedure
placement as described above is initially performed in the
first new text segment 150. Once all procedures have been
placed, the application is allowed to execute for a
predetermined time, or until a particular stage (procedure)
is reached. For example, the application may be allowed to
execute until the end of the "initialization" phase of the
application. At this point, when the application is ready to
process the "main" workload, the dynamic procedure placement
is restarted. An additional "second" new text segment 160 is
allocated and the first segment 150 is now mapped
non-executable. Dynamic procedure placement is restarted
with the second "new" text segment 160. During this phase,
the segment 150 is treated in the same manner as the
original text segment. Note that indirect procedure calls to
addresses in the original text segment 140 or the new text
segment 150 are correctly handled during the restart because
both of the segments are protected as non-executable.

Here the advantage is that the later layout will likely be
better, because now the application is executing the "core"
or main set of procedures that will determine the
application's performance. The core procedures are probably
different than the ones used during initialization of the
application.

However, when restarting a dynamic layout, return
addresses found in the stack reference the first (now "old")
new text segment 150. This problem of the stack storing the
"old" return address is solved by now making the first text
segment 150 non-executable as was done for the original text
segment 140. When a procedure does a return using the
address in the stack for the segment 150, a signal will be
generated because the instructions at the old address cannot
be executed. Again, the loader will be invoked. The loader
copies the procedure to the second new text segment 160 when
necessary and execution of the application is continued in
the second new text segment 160.
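
One way to picture the handling of a stale return address is as a translation from the first segment to the second. The following is a sketch under a simplifying assumption that is not stated in the text: a copied procedure keeps its internal instruction offsets, so the return address maps to the same offset in the new copy. The function name `translate_return_address` and the layout tables are hypothetical.

```python
import bisect


def translate_return_address(ret, old_layout, new_entry):
    """Map a return address in the first segment to the second segment.

    old_layout: list of (start, size, name) sorted by start address.
    new_entry:  name -> start address in the second segment.
    Assumes offsets inside a procedure are unchanged by the copy.
    """
    starts = [start for start, _, _ in old_layout]
    i = bisect.bisect_right(starts, ret) - 1   # last procedure starting <= ret
    if i < 0:
        raise ValueError("address precedes the first segment")
    start, size, name = old_layout[i]
    if ret >= start + size:
        raise ValueError("address is not inside a placed procedure")
    return new_entry[name] + (ret - start)
```

If instruction conversions such as jsr-to-bsr change the length of a procedure when it is copied, a real loader would need a per-instruction mapping instead of a single offset.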
Profiling During Procedure Placement 

In this variation, the application is profiled while
executing. In this mode, the calls to the dynamic loader are
not replaced as procedures are copied. Consequently, the
loader is invoked for each procedure call, and a call graph
is constructed. The call graph contains information
indicating how often procedures call each other, i.e., the
call graph indicates call "frequency."
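
A call graph with frequencies can be kept as a counter over caller/callee pairs. The sketch below is illustrative only: the function names, the trace format, and the threshold used to decide which procedures count as frequently called are assumptions, not details given in the text.

```python
from collections import Counter


def build_call_graph(trace):
    """Count how often each (caller, callee) edge occurs in the trace."""
    return Counter(trace)


def frequently_called(graph, threshold):
    """Return the procedures called at least `threshold` times, sorted by name."""
    calls_into = Counter()
    for (_caller, callee), count in graph.items():
        calls_into[callee] += count
    return sorted(name for name, count in calls_into.items()
                  if count >= threshold)
```

Because the loader sees every call in this mode, the counts are exact for the profiling period, at the cost of a loader invocation per call.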

At the end of the profiling period, "important" (frequently
called) procedures are identified, and these procedures are
copied to the new text segment at optimal locations using
the call graph information. Here, optimal means minimizing
cache conflicts. The remaining "less important" procedures
are dynamically placed during execution as described above.
This hybrid static/dynamic placement can improve over a
purely dynamic placement because the important procedures
can be placed to eliminate any possibility of cache
conflicts. In the case where conflicts still exist, a copy
(clone) of a conflicting procedure can be placed elsewhere.
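
For a direct-mapped instruction cache, whether two placed procedures conflict can be checked by comparing the cache lines they occupy. This is a minimal sketch; the 8-kilobyte cache size and 32-byte line size are illustrative defaults chosen for the example, and the function names are hypothetical.

```python
def cache_lines(addr, size, cache_size=8192, line_size=32):
    """Return the set of direct-mapped cache line indices a procedure occupies."""
    num_lines = cache_size // line_size
    first = addr // line_size
    last = (addr + size - 1) // line_size
    return {line % num_lines for line in range(first, last + 1)}


def conflicts(proc_a, proc_b, cache_size=8192, line_size=32):
    """True if two (address, size) procedures share any cache line index."""
    lines_a = cache_lines(*proc_a, cache_size=cache_size, line_size=line_size)
    lines_b = cache_lines(*proc_b, cache_size=cache_size, line_size=line_size)
    return bool(lines_a & lines_b)
```

Two procedures placed back to back never share a line index as long as their combined size fits in the cache and each starts on a line boundary, whereas two procedures whose addresses differ by exactly the cache size always do.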

The restarts and profiling can be invoked by adaptive
time-related conditions. For example, restarts and/or
profiling can periodically be invoked for a long running
application, perhaps, once per hour. If the layout after
restart is similar to the previous layout, then the restart
interval can be increased. If the procedure "mix" in the new
layout is different, restarts can be invoked on a more
frequent basis.
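
The adaptive policy can be sketched as a comparison of two layouts followed by an interval adjustment. Everything specific here is an assumption made for illustration: the text does not say how similarity is measured or by how much the interval changes, so the sketch uses the Jaccard overlap of the placed procedure sets, a 0.8 threshold, and doubling or halving within fixed bounds.

```python
def layout_similarity(previous, current):
    """Jaccard overlap between the sets of procedures in two layouts."""
    a, b = set(previous), set(current)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def next_restart_interval(interval_s, previous, current,
                          threshold=0.8, min_s=60, max_s=86400):
    """Lengthen the interval for similar layouts, shorten it otherwise."""
    if layout_similarity(previous, current) >= threshold:
        return min(interval_s * 2, max_s)
    return max(interval_s // 2, min_s)
```

Starting from a one-hour interval, an unchanged procedure mix moves the next restart to two hours, while a completely different mix moves it to thirty minutes.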
Method Steps 

FIG. 2 shows the method steps 200 according to the
invention. In step 210, the stored copy of the application
program is mapped to non-executable memory, and an empty
executable segment large enough to store the program is
allocated. In step 220, the first executable procedure,
i.e., the "start" procedure, is copied to the new segment.
Step 230 transfers execution to the start procedure. Step
240 traps any calls to the non-executable segment back into
the loader. In step 250, the loader copies the called
procedure from the non-executable segment to the executable
segment. Step 260 restores the trapped call instructions,
and modifies known procedure pointers. Execution resumes in
step 270.

Advantages

The present invention transparently improves the
performance of an application program by dynamically placing
procedures in a memory so that cache misses are reduced.
The method is transparent in that the stored image of the
application is never modified, and also the relocation of
the procedures requires no user intervention, other than
invoking the application by the dynamic loader. In addition,
only executed procedures are placed so that memory
requirements are reduced. Dynamic placement can also reduce
the number of executed instructions due to instruction
conversions. Other optimizations result in fewer instruction
TLB faults, and reduced overhead due to calling conventions
between procedures.

Measured results indicate that the present invention
provides an immediate 10% improvement in the performance of
the Oracle 7.3 database running a number of TPC-D
benchmarks on a Digital Alpha 4100 server. Other
applications experienced similar improvements. The invention
may also improve the aggregate performance of a processor by
reducing the size of the instruction working set of an
application.

The foregoing description has been directed to specific 
embodiments of this invention. It will be apparent, however, 
that variations and modifications may be made to the 
described embodiments, with the attainment of all or some 
of the advantages. Therefore, it is the object of the appended 
claims to cover all such variations and modifications as 
come within the spirit and scope of the invention. 

I claim: 

1. A computerized method for placing procedures of a 
program in a memory, comprising the steps of: 

mapping the program to non-executable addresses of the 
memory;

allocating a segment of the memory as executable; 

copying one or more of the procedures from the non- 
executable addresses to the segment as the procedures 
are executed to dynamically place the procedures of the 
program in memory;

while executing the procedures of the program, generat- 
ing a trap when the executing procedure calls a proce- 
dure mapped to a non-executable address of the non- 
executable addresses; and 

responding to the trap by copying the called procedure to
the segment when the called procedure has not previ- 
ously been copied to the segment and invoking the 
called procedure at its location in the segment. 

2. The method of claim 1 including copying a starting 
procedure of the program to the segment by a loader and
modifying destination addresses of procedure calls in the 
starting procedure to be procedure calls to the loader. 

3. The method of claim 1 wherein an order in which the 
procedures are called determines an order in which the 
procedures are copied to the segment to store a called
procedure adjacent to a calling procedure. 

4. The method of claim 2 including converting a proce- 
dure call via a jump instruction to a procedure call via a 
branch instruction. 

5. The method of claim 2 including providing a hint for
the destination address. 

6. The method of claim 1 including repeating the 
mapping, allocating and copying steps for a different seg- 
ment of the memory after the procedures of the program 
have been placed.

7. The method of claim 6 including generating a signal to 
call the loader if the application attempts to return to a 
procedure at a non-executable address. 

8. The method of claim 7 including a profiling executing
program to construct a call graph with profile information,
and repeating the mapping, allocating and copying steps
according to profile information in the call graph.

9. The method of claim 8 including statically placing a
first set of procedures that are called most frequently, and
dynamically placing a second set of procedures that are
called least frequently.

10. The method of claim 6 including repeating according 
to adaptive time-related conditions. 

11. The method of claim 1, wherein the responding step 
includes updating a stored procedure pointer in the execut- 
ing program to a value corresponding to an address of the 
called procedure in the segment. 

12. The method of claim 1, wherein the responding step 
includes, when the procedure call in the executing procedure 
that caused the trap to be generated is an indirect procedure 
call, updating a stored procedure pointer in the executing 
program to a value corresponding to an address of the called 
procedure in the segment. 

13. The method of claim 1, wherein the copying step 
includes, modifying destination addresses of direct proce- 
dure calls in the copied procedures from the destination 
addresses of originally called procedures to a destination 
address of a loader procedure to form modified destination 
procedure calls; wherein the destination addresses of the 
originally called procedures are a subset of the non- 
executable addresses; 

the method including: 

executing the loader procedure when any one of the 
modified direct procedure calls is executed, the loader 
procedure executing step including identifying the 
originally called procedure associated with the modi- 
fied destination procedure call being executed; 

copying the identified originally called procedure to the 
segment if the identified originally called procedure has 
not previously been copied to the segment; and 

replacing the modified direct procedure call to the loader 
procedure with a new direct procedure call to the 
originally called procedure at its location in the seg- 
ment. 

14. The method of claim 13, wherein the copying step 
performed when executing the loader procedure includes 
modifying destination addresses of direct procedure calls in 
the originally called procedure from the destination 
addresses of respective originally called procedures to the 
destination address of the loader procedure to form modified 
destination procedure calls. 

15. A computer program product for use in conjunction 
with a computer system, the computer program product 
comprising a computer readable storage medium and a 
computer program mechanism embedded therein, the com- 
puter program mechanism comprising: 

a runtime module for executing a program having a 
plurality of procedures, the runtime module including 
instructions for: 

mapping the program to non-executable addresses of 
the memory; 

allocating a segment of the memory as executable; and

copying one or more of the procedures from the non-
executable addresses to the segment as the proce-
dures are executed to dynamically place the proce-
dures of the program in memory;

the runtime module further including a trap procedure,
which is to be automatically invoked when the execut- 
ing procedure calls a procedure mapped to a non- 
executable address of the non-executable addresses, the 
trap procedure including instructions for copying the 
called procedure to the segment when the called pro- 
cedure has not previously been copied to the segment 
and invoking the called procedure at its location in the 
segment. 






US 6,240,500 B1






16. The computer program product of claim 15 wherein 
the instructions for copying include instructions for copying 
a starting procedure of the program to the segment by a 
loader and modifying destination addresses of procedure 
calls in the starting procedure to be procedure calls to the
loader. 

17. The computer program product of claim 15 wherein 
the instructions for copying include instructions for convert- 
ing a procedure call via a jump instruction to a procedure 
call via a branch instruction.

18. The computer program product of claim 15 wherein 
the segment of the memory is a first segment and the runtime 
module includes instructions for repeating the mapping, 
allocating and copying to a second segment of the memory 
after at least a subset of the procedures of the program have
been placed. 

19. The computer program product of claim 18 wherein 
the trap procedure is configured to be invoked whenever a 
procedure call is made to a procedure in the first segment. 

20. The computer program product of claim 18 including 
a profiling executing program to construct a call graph with 
profile information, and the runtime module includes 
instructions for repeating the mapping, allocating and copy- 
ing according to profile information in the call graph. 

21. The computer program product of claim 20, the 
runtime module including instructions for statically placing 
a first set of procedures that are called most frequently, and 
dynamically placing a second set of procedures that are 
called least frequently. 

22. The computer program product of claim 15 wherein 
the trap procedure includes instructions for updating a stored 
procedure pointer in the executing program to a value 
corresponding to an address of the called procedure in the 
segment. 

23. The computer program product of claim 15 wherein
the trap procedure includes instructions, to be activated
when the procedure call in the executing procedure that
caused the trap to be generated is an indirect procedure call, 
for updating a stored procedure pointer in the executing 
program to a value corresponding to an address of the called 
procedure in the segment. 

24. The computer program product of claim 15 wherein 
the instructions for copying include instructions for modi- 
fying destination addresses of direct procedure calls in the 
copied procedures from the destination addresses of origi- 
nally called procedures to a destination address of a loader 
procedure to form modified destination procedure calls; 
wherein the destination addresses of the originally called 
procedures are a subset of the non-executable addresses; 

the runtime module includes instructions for: 

executing the loader procedure when any one of the 
modified direct procedure calls is executed, the loader 
procedure executing step including identifying the 
originally called procedure associated with the modi- 
fied destination procedure call being executed; 

copying the identified originally called procedure to the 
segment if the identified originally called procedure has 
not previously been copied to the segment; and 

replacing the modified direct procedure call to the loader 
procedure with a new direct procedure call to the 
originally called procedure at its location in the seg- 
ment. 

25. The computer program product of claim 24 wherein 
the instructions for copying, performed when executing the 
loader procedure, include instructions for modifying desti- 
nation addresses of direct procedure calls in the originally 
called procedure from the destination addresses of respec- 
tive originally called procedures to the destination address of 
the loader procedure to form modified destination procedure 
calls. 